Device and method for carrying out multichannel acoustic echo cancellation with a variable number of channels

ABSTRACT

A multichannel full-duplex audio signal transmission system comprises an adaptive filter (2) provided for multichannel acoustic echo cancellation. A channel combining device (5) is provided between the preprocessing units (V1, . . . , VD) and the loudspeakers (L1, . . . , LD), by means of which several loudspeakers can be connected to one and the same preprocessing unit when only C<D channels are used, and by means of which the remaining D-C preprocessing units can be separated from the loudspeakers (L1, . . . , LD). The channel combining device permits an optimization of the convergence rate of the adaptive adjustment of filter coefficients in said adaptive filter (2).

BACKGROUND OF THE INVENTION

The present invention concerns a device and a method for multichannel acoustic echo compensation with a variable number of channels, as used especially for acoustic human-machine interfaces with hands-free devices and multichannel output, in order to make multichannel full-duplex communication possible.

The basic problems of acoustic echo compensation are described in detail in the review article “Stereophonic Acoustic Echo Cancellation - An Overview of the Fundamental Problem”, IEEE Signal Processing Letters, Vol. 2, No. 8, August 1995, by M. Mohan Sondhi et al.

If only a single full-duplex audio channel is used for bi-directional speech transfer between a first and a second audio transmission and receiving unit in acoustic human-machine interfaces (for example, microphones and loudspeakers in video conference systems or telephone conference systems), then acoustic echo compensation can be performed by using adaptive filters in order to suppress undesirable echoes which arise from feedback between loudspeakers and microphones in the first and second audio transmission and receiving units.

In conventional single-channel acoustic echo compensators, the use of a single FIR (finite impulse response) filter with adaptively adjustable filter coefficients is sufficient to model the acoustic impulse response of the echo path. The echo estimate produced by the adaptive filter is then subtracted from the actual echo signal to obtain an error signal. The filter coefficients are continuously re-adapted to track the echo path, which may change in the course of time, so that the error signal is kept as low as possible at all times.
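The single-channel scheme described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the text does not name an adaptation algorithm, so the normalized LMS (NLMS) update is used here as a stand-in, and the function name, step size `mu`, and the simulated four-tap echo path are hypothetical.

```python
import numpy as np

def nlms_echo_canceller(x, y, num_taps, mu=0.5, eps=1e-8):
    """Single-channel echo canceller sketch (NLMS is an assumption).

    x: loudspeaker signal, y: microphone signal containing the echo.
    Returns the error signal and the final FIR coefficients.
    """
    w = np.zeros(num_taps)      # adaptive FIR coefficients
    buf = np.zeros(num_taps)    # latest loudspeaker samples, newest first
    e = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        echo_estimate = w @ buf             # modeled echo
        e[n] = y[n] - echo_estimate         # error signal
        # Normalized LMS update of the filter coefficients
        w += mu * e[n] * buf / (buf @ buf + eps)
    return e, w

# Simulated echo path (hypothetical values) driven by white noise
rng = np.random.default_rng(0)
true_path = np.array([0.6, -0.3, 0.1, 0.05])
x = rng.standard_normal(5000)
y = np.convolve(x, true_path)[:len(x)]
e, w = nlms_echo_canceller(x, y, num_taps=4)
```

In this noise-free simulation the coefficients `w` approach `true_path` and the error signal decays toward zero; a real room would need far more taps and would never be noise-free.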

However, especially in video conference or telephone conference transmissions, it may be desirable to use several acoustic transmission channels, each with at least one assigned loudspeaker, to transfer an acoustic pattern which is as true to the room as possible from a first to a second audio transmission and receiving unit. This is of interest, for example, when several speakers are located in a first room and their speech is to be transferred to a receiver in a second room. If two or more acoustic transmission channels are used to the second room, where a listener is located, then this listener receives a stereo or multichannel acoustic pattern from the first room, which makes it easier, for example, to assign the speech to the individual speakers.

However, as explained in the above review article and also, for example, in “Stereo Projection Echo Canceller with True Echo Path Estimation”, Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 95), Detroit, Mich., USA, pp. 3059–3062, May 1995, by S. Shimauchi et al., or in “A better understanding and an improved solution to the problems of stereophonic acoustic echo cancellation”, Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 97), Munich, pp. 303–306, April 1997, by J. Benesty et al., the mutual influence of the individual transmission channels on one another causes a number of additional problems in stereo or multichannel compensation in comparison to the mono-channel situation, where an individual adaptive filter is sufficient for echo compensation.

Various approaches to the problems that occur in the multichannel case are explained in particular in the article “Stereophonic Acoustic Echo Cancellation - An Overview and Recent Solutions”, Proc. 6th Int. Workshop on Acoustic Echo and Noise Control, Pocono Manor, Pa., USA, pp. 12–19, September 1999, by S. Makino et al. Specifically, the following are dealt with: the addition of statistically independent noise signals to the loudspeaker signals, nonlinear signal processing, the use of decorrelation filters, the use of various time-variable filter techniques, and the use of special adaptive algorithms in the filters.

Especially in the multichannel case, according to the present state of knowledge, signal processing for partial (acoustically not detectable) decorrelation of the loudspeaker signals is necessary in order to make unequivocal convergence of the adaptive filters to the true room impulse responses possible. As already stated, the basic idea of echo compensation is to simulate, using digital filter structures, the echo paths which arise from the interplay of certain loudspeaker characteristics, certain room acoustics and certain microphone characteristics.

This will be explained below in more detail with the aid of FIG. 3. In the echo compensation device according to the state of the art shown there, the audio signals emitted by a multichannel audio signal processing unit 1 are sent through separate loudspeaker channels LK1, . . . , LKD to the corresponding loudspeakers L1, . . . , LD. A channel-specific preprocessing unit V1, . . . , VD is located in each of the loudspeaker channels LK1, . . . , LKD. The audio signals running through the preprocessing units V1, . . . , VD can each be locked there individually in a channel-specific manner.

The loudspeakers L1, . . . , LD assigned individually to loudspeaker channels LK1, . . . , LKD emit acoustic signals corresponding to the received audio signals into the surrounding room.

Furthermore, a microphone M is provided which serves as input interface for acoustic signals, for example, speech sounds from a person speaking into the microphone.

The microphone M converts the received acoustic signals into microphone signals, which are sent back to the multichannel audio signal processing unit 1 through a microphone channel MK for further processing.

The acoustic signals radiated by loudspeakers L1, . . . , LD are superimposed depending on the structures in the room in which the loudspeakers L1, . . . , LD are set up, and are also received by microphone M.

As a result, echo signals are produced, because the acoustic signals emitted by the loudspeakers L1, . . . , LD are received by the microphone M, are sent from there to the multichannel audio signal processing unit 1 and, under certain circumstances, are sent from there again to loudspeakers L1, . . . , LD.

The basic idea of echo compensation is to compensate, by means of digital filter structures, the “echo paths” arising from the interaction of the acoustic signals emitted by the loudspeakers L1, . . . , LD, from their different paths to microphone M as predetermined by the spatial propagation conditions, and from the microphone characteristics. This is done in that such digital filter structures produce estimate signals for the echo signals expected through the echo paths, and these estimate signals are subtracted from the microphone signals which contain the actual echo signals.

If there were exact agreement between the real room impulse responses and the impulse responses of the digital filter, the echo signals would be cancelled completely in the microphone signal.
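The subtraction principle can be written compactly. The notation below is not taken from the text and is introduced only for illustration: x_d(n) denotes the vector of the most recent samples on loudspeaker channel d, h_d the true echo path from loudspeaker d to the microphone, ĥ_d its estimate in the digital filter, and s(n) the local signal (speech and noise) at the microphone.

```latex
\begin{align}
  y(n)       &= \sum_{d=1}^{D} \mathbf{h}_d^{T}\,\mathbf{x}_d(n) + s(n)
             && \text{(microphone signal)} \\
  \hat{y}(n) &= \sum_{d=1}^{D} \hat{\mathbf{h}}_d^{T}\,\mathbf{x}_d(n)
             && \text{(echo estimate)} \\
  e(n)       &= y(n) - \hat{y}(n)
              = \sum_{d=1}^{D} \bigl(\mathbf{h}_d - \hat{\mathbf{h}}_d\bigr)^{T}\mathbf{x}_d(n) + s(n)
             && \text{(error signal)}
\end{align}
```

Under these assumptions, exact agreement of every ĥ_d with h_d leaves only the local signal s(n) in e(n).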

However, since the echo paths generally have a very complex structure which is not known beforehand and which, in addition, can change in time, the echo paths must be continuously reidentified, that is, adaptively identified.

The adaptive filter 2 shown in FIG. 3 serves this purpose: the audio signals fed through channels LK1, . . . , LKD to loudspeakers L1, . . . , LD are supplied to this filter through branch lines A1, . . . , AD. In the adaptive filter 2, the audio signals supplied through branch lines A1, . . . , AD are weighted with coefficients (filter coefficients) that are optimized according to specified adaptation algorithms. The adaptive adjustment is based on mathematical models which adjust the currently valid filter coefficients to the currently valid echo path conditions.
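A D-channel adaptive filter of this kind might be sketched as follows. This is a minimal illustration, not the adaptation algorithm of the source (which leaves the algorithm open): a normalized LMS update over all channels is assumed, and the function name, step size, and simulated two-channel echo paths are hypothetical.

```python
import numpy as np

def multichannel_nlms(X, y, num_taps, mu=0.5, eps=1e-8):
    """D-channel adaptive filter sketch (NLMS update is an assumption).

    X: array of shape (D, N) with the branched loudspeaker signals.
    y: microphone signal of length N.
    Returns the error signal and a (D, num_taps) coefficient array.
    """
    D, N = X.shape
    W = np.zeros((D, num_taps))     # one FIR filter per loudspeaker channel
    buf = np.zeros((D, num_taps))   # latest samples per channel, newest first
    e = np.zeros(N)
    for n in range(N):
        buf = np.roll(buf, 1, axis=1)
        buf[:, 0] = X[:, n]
        e[n] = y[n] - np.sum(W * buf)               # subtract the echo estimate
        W += mu * e[n] * buf / (np.sum(buf * buf) + eps)
    return e, W

# Two statistically independent channels and hypothetical echo paths
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 4000))
paths = np.array([[0.5, -0.2, 0.1, 0.0],
                  [0.3, 0.4, -0.1, 0.05]])
y = sum(np.convolve(X[d], paths[d])[:X.shape[1]] for d in range(2))
e, W = multichannel_nlms(X, y, num_taps=4)
```

With independent channel signals, as assumed here, the coefficients converge to the simulated echo paths; the difficulties discussed in the text arise when the channel signals are strongly correlated.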

In order to make unequivocal convergence of the filter coefficients to the true room impulse responses possible in the multichannel case, the signal preprocessing for partial (acoustically not detectable) decorrelation of the loudspeaker signals, which is necessary according to present-day knowledge (see, for example, the article by J. Benesty et al. mentioned above), is carried out in the preprocessing units V1, . . . , VD shown in FIG. 1.
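As an illustration of what such a preprocessing unit might do, the sketch below applies a half-wave rectifier nonlinearity, a form commonly associated with the work of Benesty et al. The text itself does not specify the nonlinearity, so the formula, the function name, and the value of `alpha` are assumptions.

```python
import numpy as np

def halfwave_preprocess(x, alpha, positive=True):
    """Add a scaled half-wave rectified copy of x to x (assumed nonlinearity).

    positive=True uses the positive half-wave, positive=False the negative
    one, so that two channels receive different distortions.
    """
    rectified = (x + np.abs(x)) / 2 if positive else (x - np.abs(x)) / 2
    return x + alpha * rectified

rng = np.random.default_rng(1)
mono = rng.standard_normal(10000)

# The same mono signal on two channels, with and without preprocessing
left = halfwave_preprocess(mono, alpha=0.5, positive=True)
right = halfwave_preprocess(mono, alpha=0.5, positive=False)

corr_before = np.corrcoef(mono, mono)[0, 1]   # identical signals
corr_after = np.corrcoef(left, right)[0, 1]   # partially decorrelated
```

The correlation coefficient drops only slightly below one, which matches the text's point that the decorrelation is partial; `alpha = 0.5` is chosen here for visibility rather than for inaudibility.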

However, it can be shown theoretically and experimentally that, in spite of this preprocessing, the expenditure for echo compensation generally increases with an increasing number of channels, and the convergence behavior of the individual channel signals to be superimposed in the adaptive filter becomes worse. If D different preprocessing units are used, this leads to very slow convergence of the filter coefficients when the number of channels C actually used by the audio signal is smaller than the number of channels D actually present, that is, when C<D. This case is typical for use in multimedia terminal equipment (for example, when a multimedia terminal is used as a stereo television unit on which a broadcast is watched whose sound is delivered in only one mono channel).

Performing multichannel echo compensation for acoustic interfaces in multimedia terminals is a relatively new application. Conventional approaches for telephone conference applications provide a fixed channel number, D, for the audio signals.

The relatively slow convergence behavior arises in this case from insufficient decorrelation of originally identical audio signals which are passed through separate audio channels.
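The limiting case, in which identical signals are passed through two channels with no decorrelation at all, can be checked numerically. The sketch below is an illustration only; the helper name and signal lengths are hypothetical. It stacks the tap-delay vectors of both channels and compares the rank of the resulting correlation matrix with the rank obtained for independent signals.

```python
import numpy as np

def stacked_regressors(channels, num_taps):
    """Build rows of stacked tap-delay vectors, one row per time index."""
    n = len(channels[0])
    rows = []
    for t in range(num_taps - 1, n):
        rows.append(np.concatenate(
            [ch[t - num_taps + 1:t + 1][::-1] for ch in channels]))
    return np.array(rows)

rng = np.random.default_rng(2)
mono = rng.standard_normal(2000)
other = rng.standard_normal(2000)
taps = 4

X_same = stacked_regressors([mono, mono], taps)     # same signal on both channels
X_indep = stacked_regressors([mono, other], taps)   # independent signals

rank_same = np.linalg.matrix_rank(X_same.T @ X_same)
rank_indep = np.linalg.matrix_rank(X_indep.T @ X_indep)
```

For identical signals the 8x8 correlation matrix has only rank 4, so the filter coefficients are not uniquely determined; partial decorrelation restores full rank but leaves the matrix poorly conditioned, which is consistent with the slow convergence described above.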

The approach known from the article by J. Benesty et al. cited above as state of the art provides D identical nonlinear preprocessing units, as a result of which the above problem is lessened. However, the decorrelation possibilities obtained in this way are also limited, especially when the signals of the individual channels differ mainly in their levels (for example, in the case of intensity stereophony).

SUMMARY OF THE INVENTION

Therefore, the task of the present invention is to overcome the disadvantages of the devices known from the art for multichannel acoustic echo compensation with a variable number of channels.

In particular, it is the task of the present invention to provide a device for multichannel acoustic echo compensation with a variable number of channels for the case in which the number of channels actually used, C, is smaller than the number of channels actually present, D, and in which the decorrelation problems arising in the state of the art are avoided.

Furthermore, it is a task of the present invention to provide methods for multichannel echo compensation where the number of channels used, C, is smaller than the number of channels actually present, D.

The approach according to the invention for echo compensation in the reproduction of C-channel audio signals on a D-channel system (C<D) makes use of the fact that the number of channels of the audio signal is known (for example, when stereo information is present in a television signal). It is therefore possible to decorrelate only the C<D actually-used audio channels through independently operating preprocessing units. The remaining D-C loudspeaker signals are then simply connected to the actually-used C audio channels (for example, in the mono case both loudspeaker signals are connected to channel 1 of a stereo system).

BRIEF DESCRIPTION OF THE DRAWINGS

The advantages and characteristics of the present invention follow from the explanation of preferred practical examples given below in combination with the drawings.

The following are shown:

FIG. 1 shows a schematic representation of a first embodiment of a device according to the invention for multichannel echo compensation.

FIG. 2 shows a schematic representation of a second embodiment of a device according to the invention for multichannel echo compensation.

FIG. 3 shows a schematic representation of a known device for multichannel echo compensation according to the state of the art.

DETAILED DESCRIPTION

A first embodiment of a device according to the invention for echo compensation will be explained below as an example based on FIG. 1.

Here, elements which were already explained in combination with the state of the art according to FIG. 3 are provided with identical reference numbers to those in FIG. 3 and will not be explained in more detail below.

In addition to the elements shown in FIG. 3, the first embodiment of a device according to the invention shown in FIG. 1 has a channel-combination device 5, which is provided between the D preprocessing units V1, . . . , VD and the branch lines A1, . . . , AD leading to the adaptive filter 2.

Furthermore, a data line 8 is provided between the multichannel audio signal processing unit 1 and the channel-combination device 5. Through data line 8, the multichannel audio signal processing unit 1 transmits to the channel-combination device 5 the number C of channels actually to be used, which can be smaller than the total number D of channels actually present.

Using the channel-combination device 5, several loudspeakers which are supposed to receive exactly the same audio signals are in each case connected to a single common inlet line, according to the number C of channels actually to be used, which the multichannel audio signal processing unit 1 provides to the channel-combination device 5. In the most general case, this is done by simply connecting several loudspeakers to one inlet line in the channel-combination device 5. The channel-combination device 5 thereby decouples the unnecessary D-C preprocessing units from the loudspeakers.

In other words:

The loudspeaker channel signals LS1, . . . , LSD supplied by the multichannel audio signal processing unit 1 through the D loudspeaker channels LK1, . . . , LKD are combined with one another in the channel-combination device 5, by superimposing individual loudspeaker signals on one another, so that only C<D independent output signals are present at the output of the channel-combination device.

This will be explained with the following example: in the case of a reduction from seven input loudspeaker channel signals LS1, LS2, LS3, LS4, LS5, LS6, LS7 to four signals LS1, LS23, LS4, LS567, the incoming loudspeaker channel signals LS1 and LS4 are left unchanged, while the loudspeaker channel signals LS2 and LS3 are combined into a signal LS23 and the loudspeaker channel signals LS5, LS6 and LS7 are combined into a signal LS567. These four output signals LS1, LS23, LS4 and LS567 can then be fed, for example, to the seven loudspeakers provided in this case as follows:

LS1 to L1, LS23 to L2 and to L3, LS4 to L4, LS567 to L5, L6 and L7.
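The seven-to-four example can be sketched as a simple routing step. This is a hypothetical illustration: the function name and data layout are invented here, and it assumes that each group is fed from the preprocessed signal of its first channel, which is only one possible reading of how the signals are combined.

```python
import numpy as np

def combine_channels(preprocessed, groups):
    """Route C shared signals to D loudspeakers (hypothetical helper).

    preprocessed: array of shape (D, N), one preprocessed signal per channel.
    groups: list of C lists of 1-based loudspeaker indices; all loudspeakers
            in a group receive the same signal.
    Returns (combined, speaker_feed) with shapes (C, N) and (D, N).
    """
    num_speakers = preprocessed.shape[0]
    covered = sorted(i for g in groups for i in g)
    if covered != list(range(1, num_speakers + 1)):
        raise ValueError("groups must cover every loudspeaker exactly once")
    # Assumption: the first channel of each group supplies the shared signal;
    # the other preprocessing units of that group are left unused.
    combined = np.stack([preprocessed[g[0] - 1] for g in groups])
    speaker_feed = np.empty_like(preprocessed)
    for c, group in enumerate(groups):
        for idx in group:
            speaker_feed[idx - 1] = combined[c]
    return combined, speaker_feed

rng = np.random.default_rng(5)
ls = rng.standard_normal((7, 16))             # LS1 ... LS7
groups = [[1], [2, 3], [4], [5, 6, 7]]        # LS1, LS23, LS4, LS567
combined, speaker_feed = combine_channels(ls, groups)
```

After this step the seven loudspeaker feeds contain only four independent signals, which are the ones the adaptive filter needs to work with.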

With the measures according to the invention, the additional convergence problems of the filter coefficients, which occur in conventional multichannel echo compensation when loudspeaker signals are reproduced with a reduced number of channels, are avoided.

When using a device according to the invention for multichannel echo compensation in which, by means of a channel-combination device, only C<D audio channels are actually utilized, the performance that can be achieved with a D-channel echo compensator (D>C) is comparable to that achievable with a conventional echo compensator having only C channels. All this is possible with an extremely small additional expenditure, namely by providing the said channel-combination device 5.

The approach according to the invention is independent of the actual adjustment algorithm used, of the actual preprocessing method used, and of the channel number D of the system.

For echo compensation in the case of C channels, in a device according to the invention, a maximum of C of the actually-present D preprocessing units are used.

In order to achieve maximum efficiency, exactly C different preprocessing units must be used.

A second embodiment of the device according to the invention for echo compensation is now explained in more detail with the aid of FIG. 2. In this embodiment, the elements shown in FIG. 1 are complemented by an intermediate buffer 6 as well as by a transfer logic 7. The intermediate buffer 6 is connected to the transfer logic 7 through a bi-directional bus line 9, and the transfer logic is in turn connected to the adaptive filter 2 through a bi-directional bus line 10. In addition, the transfer logic 7 is connected to the channel-combination device 5 through a unidirectional bus line 11.

Intermediate buffer 6 serves for storage of estimated impulse responses which were determined previously by the adaptive filter 2 and which were transported through the bi-directional bus line 10 into the transfer logic 7 and from there through the bi-directional bus line 9 into intermediate buffer 6.

In a system with D loudspeaker channels and an adaptive filter 2 in which a number L of filter coefficients is provided for each loudspeaker channel, sufficient memory must be present in the intermediate buffer 6 to store L filter coefficients for each of the maximum number D of channels used. That is, it must be possible to store D×L estimated filter coefficients.

The transfer logic 7 receives from the channel-combination device 5 through bus line 11 the indices of the presently-used channels, the number of which is smaller than or equal to the number D of the actually-available channels.

The purpose of such buffer storage of estimated impulse responses (filter coefficients) is the following: if one changes from a number of channels X used during an operational phase a to a different number of channels Y during an operational phase b, and then changes back to the number of channels X during a following operational phase c, then, at the beginning of operational phase c, the filter coefficients used until the end of operational phase a can be retrieved as starting values for the renewed adjustment made necessary by any changes in room acoustics that may have occurred in the meantime.
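The store-and-retrieve behavior over phases a, b and c might look as follows in software. This is a hypothetical sketch: the class and method names are invented for illustration, and it keeps one entry per channel configuration, so it may hold more than the D×L coefficients named in the text as the required capacity.

```python
import numpy as np

class CoefficientBuffer:
    """Stores filter coefficients per channel configuration (hypothetical)."""

    def __init__(self, num_channels, taps_per_channel):
        self.D = num_channels          # channels physically present
        self.L = taps_per_channel      # filter coefficients per channel
        self._store = {}

    def save(self, active_channels, coeffs):
        """Store coefficients for the given tuple of active channel indices."""
        coeffs = np.asarray(coeffs, dtype=float)
        if len(active_channels) > self.D:
            raise ValueError("more active channels than channels present")
        if coeffs.shape != (len(active_channels), self.L):
            raise ValueError("expected one row of L coefficients per channel")
        self._store[tuple(active_channels)] = coeffs.copy()

    def restore(self, active_channels):
        """Return stored start values, or zeros if none were saved."""
        saved = self._store.get(tuple(active_channels))
        if saved is None:
            return np.zeros((len(active_channels), self.L))
        return saved.copy()

buf = CoefficientBuffer(num_channels=5, taps_per_channel=3)
five = (1, 2, 3, 4, 5)
stereo = (1, 2)

buf.save(five, np.ones((5, 3)))        # end of phase a (X = 5 channels)
start_b = buf.restore(stereo)          # phase b (Y = 2): nothing stored yet
start_c = buf.restore(five)            # phase c: previous values come back
```

Here `start_b` is all zeros because no 2-channel coefficients were stored, while `start_c` returns the values saved at the end of phase a.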

To make this procedure easier to understand, consider the following scenario: in a multimedia television system with a 5-channel Dolby surround-sound installation, certain television broadcasts (for example, feature films) are received with 5-channel sound. Other television broadcasts (for example, commercials or newscasts) are received with only 2-channel sound, or even with 1-channel sound (mono). The reduced number of sound channels is then likewise reproduced through the 5-channel Dolby surround-sound installation. This occurs, as explained above, by combining individual loudspeaker channel signals into combination signals.

If a viewer first watches, for example, a television broadcast with 5-channel sound, then, when using a device and a method according to the invention, multichannel echo compensation determines certain filter coefficients in the adaptive filter 2 shown in FIG. 2 for the given acoustic conditions in the room. If the viewer then switches from this broadcast to another television broadcast with 2-channel sound (stereo), an adaptive adjustment must be carried out again for the 2 signals emitted by the channel-combination device 5; that is, new filter coefficients for the 2-channel case must be calculated in the adaptive filter 2. If the viewer then switches back to the originally watched broadcast with 5-channel sound, adjustment of the adaptive filter for the 5-channel case is necessary again. If the room acoustic conditions have remained unaltered in the meantime, the adaptive filter 2 will find the same filter coefficients for the 5-channel case that were present before switching from the 5-channel broadcast to the 2-channel broadcast. In order to save the time that the adaptive filter needs to converge again to the filter coefficients suitable for the 5-channel case, with the aid of the measures according to claim 6 one can simply reuse the filter coefficients that were suitable before switching from the 5-channel transmission to the 2-channel transmission and that were stored in buffer 6 for intermediate storage.

Even if the acoustic conditions in the room have changed during the time span until switching back to the 5-channel transmission (for example, because people left or entered the room), in practice it can be assumed that these changes are so slight that the filter coefficients stored in buffer 6 are still relatively suitable for the new acoustic conditions and are thus very good starting values for a renewed adjustment process of the adaptive filter 2. Based on these predetermined starting values, the time needed to reach a converged state of the filter coefficients is usually significantly shorter than when the adaptive filter, starting from arbitrary values, has to recalculate the filter coefficients completely for the 5-channel case under the changed room acoustic conditions.
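The benefit of stored starting values can be illustrated with a small simulation. Everything here is an assumption made for illustration: a single channel instead of five, an NLMS update, a 16-tap echo path, and a 5 % random perturbation standing in for a slight change in room acoustics.

```python
import numpy as np

def nlms_adapt(x, y, w_init, mu=0.5, eps=1e-8):
    """Run an NLMS adaptation from given start values (assumed algorithm)."""
    w = np.array(w_init, dtype=float)
    buf = np.zeros(len(w))
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        err = y[n] - w @ buf
        w += mu * err * buf / (buf @ buf + eps)
    return w

rng = np.random.default_rng(3)
taps = 16
old_path = rng.standard_normal(taps)                     # path when coefficients were stored
new_path = old_path + 0.05 * rng.standard_normal(taps)   # slightly changed room

x = rng.standard_normal(40)                              # only a short adaptation period
y = np.convolve(x, new_path)[:len(x)]

w_cold = nlms_adapt(x, y, np.zeros(taps))   # arbitrary (zero) start values
w_warm = nlms_adapt(x, y, old_path)         # start from the buffered coefficients

cold_error = np.sum((w_cold - new_path) ** 2)
warm_error = np.sum((w_warm - new_path) ** 2)
```

After the same short adaptation period, the filter started from the buffered coefficients is much closer to the changed echo path than the one started from zero.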

This method of buffer storage of previously determined filter coefficients naturally also makes sense when the switch is first from a smaller number of audio channels used (for example, 2) to a larger number (for example, 5) and then back to the original smaller number.

Even with C<D independent audio channels, a D-channel adaptive filter is used for the compensation unit, since the computing capacity has to be dimensioned for D channels anyway in order to cover the case in which all D channels are used.

If the other D-C loudspeaker signals are combined with the C actually-used audio channels, the individual physical echo paths can no longer all be identified separately; however, this is not necessary in this case, since the correlation between loudspeakers that are connected directly to one another cannot be altered.
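Why separate identification is unnecessary follows from linearity, as the following sketch suggests. The echo paths and signal are random placeholders: when two loudspeakers are driven by the same signal, the echo they produce together equals the echo of a single path whose impulse response is the sum of the two.

```python
import numpy as np

rng = np.random.default_rng(4)
shared = rng.standard_normal(1000)   # one signal feeding two loudspeakers
h2 = rng.standard_normal(8)          # placeholder echo path of loudspeaker 2
h3 = rng.standard_normal(8)          # placeholder echo path of loudspeaker 3

# Echo at the microphone from the two physical paths
echo_separate = np.convolve(shared, h2) + np.convolve(shared, h3)

# Echo from one combined path with the summed impulse response
echo_combined = np.convolve(shared, h2 + h3)

max_diff = np.max(np.abs(echo_separate - echo_combined))
```

The two results agree up to rounding error, so estimating the summed path is sufficient to cancel the echo of the combined loudspeakers.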

REFERENCE LIST

-   1 Multichannel audio signal processing unit
-   2 Adaptive filter
-   5 Channel-combination device
-   6 Buffer
-   7 Transfer logic
-   8 Data line
-   9, 10 Bi-directional bus line
-   11 Unidirectional bus line
-   A1, . . . , AD Branch lines
-   L1, . . . , LD Loudspeakers
-   LK1, . . . , LKD Loudspeaker channels
-   LS1, . . . , LSD Loudspeaker channel signals
-   M Microphone
-   MK Microphone channel
-   V1, . . . , VD Preprocessing units
-   a, b, c Operational phases
-   X, Y Number of channels used

1. Device for multichannel acoustic echo compensation for acoustic interfaces, which includes the following: a multichannel audio signal processing unit (1); a number, D, of audio signal channels (LK1, . . . , LKD) going out from the multichannel audio signal processing unit (1); wherein each audio signal channel has a respective audio signal preprocessing unit (V1, . . . , VD) assigned to the respective audio channel; wherein at least one loudspeaker (L1, . . . , LD) is assigned to each respective audio signal channel; wherein one branch line (A1, . . . , AD) is branched off to a D-channel adaptive filter (2) from each audio signal channel between the particular assigned preprocessing unit (V1, . . . , VD) and the particular assigned at least one loudspeaker (L1, . . . , LD); a microphone (M) connected to the adaptive filter (2); a microphone channel (MK) leading back from the adaptive filter to the multichannel audio signal processing unit (1); a channel-combination device (5); wherein a preprocessing unit (V1, . . . , VD) is assigned to each audio channel; and wherein the channel-combination device (5) is arranged between the preprocessing units (V1, . . . , VD) and the loudspeakers such that in the channel-combination device (5), a number, C, of loudspeaker channels can be combined with fewer than the number, D, of audio signal channels.
2. Device according to claim 1, further comprising: a transfer section (8) between the multichannel audio signal processing unit (1) and the channel-combination device (5), through which the C number of channels actually to be occupied by the multichannel audio signal processing unit (1) can be transmitted to the channel-combination device (5).
3. Device according to claim 1, further comprising: a transfer logic (7) which communicates with the channel-combination device (5) and the adaptive filter (2), and an intermediate buffer (6) which communicates with the transfer logic (7).
4. Device for multichannel acoustic echo compensation according to claim 3, wherein the intermediate buffer has a buffer capacity for D×L filter coefficients transmitted by the transfer logic, where D is a plural number of channels of the system and L is a number of filter coefficients for a particular channel.
5. Method for multichannel acoustic echo compensation for acoustic interfaces wherein a number D of loudspeaker channel signals are always subjected to signal preprocessing before they are transmitted to the loudspeakers (L1, . . . , LD); the loudspeaker channel signals are additionally branched to a device (2) for adaptive filtering of loudspeaker signals, wherein the branched loudspeaker signals are subjected to adaptive adjustment to produce an echo compensation signal, which is deducted from a microphone signal for the purposes of minimization of the echo, and the microphone signal echo-minimized in this way is transmitted to a multichannel audio signal processing unit (1) for further processing and renewed output as loudspeaker channel signals; after signal preprocessing, individual ones of the D loudspeaker signals are so combined that only C<D combination signals remain which are transmitted through the assigned C loudspeaker channels to D loudspeakers, and that only these C combination signals are subjected to adaptive adjustment.
6. Method according to claim 5, which further comprises, as additional steps, that before a change of a number of X actually-used channels to Y≠X actually-used channels, filter coefficients which are already identified for the X channels for adaptive adjustment to produce an echo compensation signal of an adaptive filter (2), are subjected to intermediate storage, and that after changing back the number of channels from Y channels to the number of X channels, the intermediate-buffered filter coefficients are used as start values for the necessary recalculation of filter coefficients for a renewed adaptive adjustment, in order to accelerate the convergence for further adjustment.